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Metastasis is the result of stochastic genomic and epigenetic events leading to gene expression profiles that drive tumor 
dissemination. Here we exploit the principle that metastatic propensity is modified by the genetic background to generate 
prognostic gene expression signatures that illuminate regulators of metastasis. We also identify multiple microRNAs 
whose germline variation is causally linked to tumor progression and metastasis. We employ network analysis of global 
gene expression profiles in tumors derived from a panel of recombinant inbred mice to identify a network of co-expressed 
genes centered on Cnotl that predicts metastasis-free survival. Modulating Cnotl expression changes tumor cell metastatic 
potential in vivo, supporting a functional role for Cnotl in metastasis. Small RNA sequencing of the same tumor set revealed 
a negative correlation between expression of the MirllS/117 cluster and tumor progression. Expression quantitative trait 
locus analysis (eQTL) identified cw-eQTLs at the Mir216/2I7 locus, indicating that differences in expression may be 
inherited. Ectopic expression of Mirll6/I17'm tumor cells suppressed metastasis in vivo. Finally, small RNA sequencing and 
mRNA expression profiling data were integrated to reveal that miR-3470a/b target a high proportion of network 
transcripts. In vivo analysis of Mir3470a/b demonstrated that both promote metastasis. Moreover, Mir3470b is a likely 
regulator of the Cnotl network as its overexpression down-regulated expression of network hub genes and enhanced 
metastasis in vivo, phenocopying Cnotl knockdown. The resulting data from this strategy identify Cnotl as a novel regulator 
of metastasis and demonstrate the power of our systems-level approach in identifying modifiers of metastasis. 



[Supplemental material is available for this article.] 

Metastasis is a systemic disease responsible for the majority of 
cancer-related mortality and is influenced by both tumor cell-au- 
tonomous and host-derived factors. Its complexity is deepened by 
involvement of not only stochastic genomic and epigenetic events 
but also by inherited predisposition (Lifsted et al. 1998; Crawford 
et al. 2006). As a result, despite identification and characterization 
of individual genes, cellular and developmental processes associ- 
ated with metastasis, understanding of the metastatic cascade and 
the interconnectivity of individual factors remains limited. The 
elucidation of higher-order networks underlying metastasis will 
therefore likely improve prognostication and intervention strate- 
gies by identifying molecular nodes central to tumor cell dissem- 
ination and colonization. 

Recent advances in global gene expression profiling and 
computational science have provided the basis for understanding 
cancer biology at a systems level (Quigley et al. 2009). Knowledge 
of both tumor subtypes (Perou et al. 2000) and patient prognosis 
(van 't Veer et al. 2002) has been enhanced by systems-based ap- 
proaches. This knowledge may significantly change clinical prac- 
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tice by enabling the development of precision treatments based on 
molecular predictions of outcome and/or tumor response. How- 
ever, these advances, while significant from the clinical stand- 
point, are correlative and therefore do not directly address ques- 
tions about causality or the relationships between the individual 
genes within gene expression signatures. As such, these studies do 
not specifically interrogate the drivers of metastatic progression. 

Our laboratory has demonstrated that breast cancer not only 
has an inherited predisposition for metastasis (Park et al. 2005; 
Hsieh et al. 2009), but the polymorphisms that dictate metastatic 
susceptibility may also contribute to prognostic signatures (Lukes 
et al. 2009). This suggests that characterization of metastasis sus- 
ceptibility genes and the transcriptional networks affected by these 
inherited variants will be a valuable resource to visualize metasta- 
sis-associated networks, define critical nodes within the networks, 
and identify new candidate genes that underlie the metastatic 
cascade. 

In this study we utilized a recombinant inbred (RI) genetic 
reference panel of mice (Mucenski et al. 1986) as the framework to 
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globally interrogate transcriptional determinants of metastasis 
susceptibility. RI panels are specialized sets of inbred mice gener- 
ated from two inbred strains produced by 20 or more generations 
of brother-sister inbreeding (Fig. 1A). Each of the resulting sublines 
is a distinct inbred composite of two previously established pa- 
rental inbred mouse lines. RI panels are particularly useful for 
mapping inherited components for highly variable quantitative 
phenotypes, such as metastatic dissemination. Variation resulting 
from stochastic events can be reduced by phenotyping multiple 
isogenic individuals within each subline of the RI panel. This 
strategy results in a better estimation of genetic effects and in- 
creases power to detect genotype-phenotype associations com- 
pared with standard genetic mapping strategies, in which every 
individual animal is genetically unique. 

The AKXD RI panel was derived from the AKR/J and DBA/2J 
inbred strains (Mucenski et al. 1986). Previous work in our labo- 
ratory has demonstrated that, when crossed to highly metastatic 
male MMTV-PyMT transgenic mice (Guy et al. 1992), the progeny 
of these two strains exhibit >20-fold difference in metastatic pro- 
pensity (Lifsted et al. 1998). Crossing each AKXD subline to 
MMTV-PyMT transgenic mice (denoted henceforth as [AKXD n x 
PyMT]Fl) resulted in progeny with a similar range of variability in 
metastatic susceptibility across the AKXD panel (Fig. 1A). Since the 
oncogenic driver (PyMT) and paternal genetic background (FVB/ 
NJ) are identical among all progeny, the phenotypic diversity in 
metastatic susceptibility is most likely a result of germline poly- 
morphism in the maternal genome. As such, we selected [AKXD n x 
PyMT]Fl progeny as a genetically defined vehicle for identification 
of heritable metastasis-associated transcriptional networks. 

It has been demonstrated that microRNAs (miRNAs) play 
a key role in the initiation and progression of multiple tumor types 
(Tavazoie et al. 2008; Pencheva and Tavazoie 2013). Our group 
recently showed that heritable differences in miRNA expression 
underlie metastatic susceptibility (Goldberger et al. 2013). The 
concept that miRNAs are post-transcriptional regulators of gene 
expression led us to hypothesize that miRNAs regulate metastatic 



progression by targeting heritable metastasis-driving transcrip- 
tional networks. We thus devised an approach that combined 
mRNA and miRNA profiling to enable identification of co- 
expressed metastasis-driving transcriptional networks (Fig. IB) and 
miRNAs that potentially regulate these networks. 

This strategy identified Cnot2, a structural component of the 
CCR4-NOT transcriptional regulatory deadenylase complex, as a 
metastasis modifier gene, suggesting a role for RNA processing and 
degradation in tumor progression and malignancy. Furthermore, 
conelation analysis between metastasis susceptibility and microRNA 
expression data, eQTL analysis, and subsequent in vivo validation 
identified the Mir216a, Mir216b, Mir217 as inherited metastasis- 
suppressing loci (Fig. 1C). Finally, computational screening of the 
transcriptional network hubs for common miRNAs followed by 
experimental validation studies identified miR-3470a and miR- 
3470b as metastasis-enhancing miRNAs (Fig. ID). These findings 
demonstrate the utility of our integrated approach for elucidating 
previously unidentified factors that both positively and negatively 
influence metastatic progression. 

Results 

Correlation of gene expression and metastatic efficiency 

We have previously demonstrated that germline polymorphisms 
can modify gene expression patterns and result in differences in 
heritable metastatic susceptibility (Faraji et al. 2012; Hu et al. 2012; 
Winter et al. 2012). Based on this principle, we analyzed gene ex- 
pression profile data from two to seven tumors from each of the 
[AKXD n x PyMT]Fj progeny on Affymetrix MOE430 microarrays 
(Yang et al. 2005). Permutation tests of the gene expression profiles 
revealed 20 probe sets representing 1 7 distinct genes passing the 
genome-wide significance threshold for metastatic progression 
(P < 0.05) (Supplemental Table 1). The gene most strongly correlated 
with the metastatic phenotype was the CI protease cathepsin J 
(Ctsf), suggesting a possible role for protein degradation in meta- 
static progression. No obvious overrepresentation of any single 
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Figure 1. Genome-wide strategy to identify metastasis susceptibility co-expressed networks and their post-transcriptional regulators. (A) The pre- 
existing AKXD recombinant inbred (RI) panel was constructed by breeding metastasis-prone inbred mice from the AKR/J background to metastasis- 
resistant DBA/2J mice. F q progeny from the AKR/J and DBA/2J cross were intercrossed; F 2 progeny were bred to homozygosity, generating 24 isogenic 
AKXD sublines. Crossing each AKXD subline to the MMTV-PyMT model for highly metastatic breast cancer revealed phenotypic diversity in metastatic 
susceptibility within the AKXD RI panel. (8) RNA purified from AKXD subline primary tumors was subjected to global mRNA profiling by microarray 
followed by WCCNAand Kaplan-Meier Analysis to identify metastasis-associated co-expressed transcriptional modules. (C) AKXD subline tumor RNA was 
also subjected to small RNA sequencing followed by miRNA expression-phenotype correlations and eQTL analysis to identify metastasis susceptibility 
miRNAs. (D) Workflow illustrating analytic integration of global mRNA and miRNA profiling methods to uncover candidate post-transcriptional regulators 
of co-expressed networks. 
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functional pathway was observed among the 1 7 genes, nor were 
any of the previously identified metastasis susceptibility genes 
identified in this genetic reference panel. With the exception of 
Ms4a6d, none of the metastasis-associated genes were linked with 
regions of the genome previously shown to harbor metastasis 
susceptibility genes (Hunter et al. 2001; Lancaster et al. 2005; 
Crawford et al. 2008b; KW Hunter, unpubl.). 

To test the relevance to human disease, the human homolog 
to each gene was queried using the Gene expression-based Out- 
come for Breast cancer Online (GOBO, http://co.bmc.lu.se/gobo/) 
(Ringner et al. 2011). Of the 17 murine genes, 12 were represented 
by probes to the human homologs. Of these 12 genes, expression 
of seven (58%) segregated high- and low-risk groups with respect to 
distant metastasis-free survival (DMFS) (Supplemental Fig. 1). These 
data indicate that Cstfl, Rngtt, Areg, Umod, Pcp4, Slc25AS, and Tobl are 
mouse candidate metastasis-associated genes predictive of DMFS in 
humans. Intriguingly, Tobl and Cstfl are regulators of polyadenylyl 
tail length (Takagaki and Manley 2000; Ezzeddine et al. 2007) and 
Rngtt is an mRNA 5'-guanyltransferase capping enzyme (Pillutla et al. 
1998). To our knowledge, this is the first report implicating modu- 
lation of mRNA processing and stability in metastasis. 

Analysis of expression data reveals modulation of network 
expression by previously identified metastasis susceptibility 
genes 

We next undertook gene expression network analyses to move 
beyond single gene correlations and reveal higher-order expres- 
sion patterns driving metastatic susceptibility. To define our ap- 
proach, we proposed that modules of co-expressed genes can in- 
dicate co-regulated gene networks comprising functionally 
coherent molecular pathways (Tavazoie et al. 1999). Network-level 
analysis of expression data was achieved by examining gene ex- 
pression network modules using weighted gene co-expression 
network analysis (WGCNA) based on topological overlap measure 
(TOM) algorithms (Li and Horvath 2009). Assuming parsimonious 
network structure, the generated expression networks were visu- 
alized with minimum spanning tree algorithms. Thirteen modules 
containing between 41 and 1085 co-expressed genes were identi- 
fied (Supplemental Figs. 2-14). For clarity, modules were named for 
the most highly connected gene within each network. 

Encouragingly, we noted the membership of several pre- 
viously identified and in vivo validated metastasis modifier genes 
in our networks. The Ndn module was found to contain Arap3 
(previously known as Centd3), PH6, Ndn, and Csflr, two of which 
(Ndn and Arap3) occupied central, highly connected hub positions 
(Supplemental Fig. 2). These four genes were previously found to 
modulate metastatic efficiency in vivo (Crawford et al. 2008b). In 
addition, the Cnot2 module (Supplemental Fig. 3) was found to 
contain Arid4b and Luc7l while the Polr2g module (Supplemen- 
tal Fig. 7) contained Brd4 (Crawford et al. 2008a; Winter et al. 
2012). The membership of these bona fide metastasis modifier 
genes within our networks provided increased confidence for 
our systems-level approach to identifying metastasis-driving 
transcriptional networks. 

Network modules predict outcome in human breast 
cancer data sets 

We next turned to two human breast cancer gene expression data 
sets to determine if the individual transcription networks were also 
associated with human metastatic disease. GSE2034 (Wang et al. 



2005) and GSE1 1121 (Schmidt et al. 2008) are gene expression data 
sets from node-negative breast cancer patients who did not receive 
adjuvant therapy and for which DMFS data are available. The in- 
dividual mouse module gene sets were converted to human probe 
sets using the NetAffx Batch Query tool (Affymetrix.com). Probe 
sets representing "hub transcripts" (greater than or equal to five 
connections), which capture the majority of the module gene ex- 
pression variability (Hu et al. 2012), were used to generate gene 
signature to assess the relative impact of each module on DMFS 
by Kaplan-Meier analysis (Table 1). The individual Cnot2, KM12, 
Katnal, Chac2, Polr2g and modules significantly discriminated 
DMFS in both data sets, and are therefore the highest confidence 
metastasis-associated modules. The Sephsl and Naa35 modules 
were significant for GSE2034 and borderline for GSE11121, while 
the Rad51 and Traf7 modules were significant for GSE2034 but 
not GSE11121. Conversely, the Fabp4 and Eno3 modules were 
significant for GSE11121 but not GSE2034. The Ndn module was 
borderline significant for both, while the Gsk3b module was not 
significant for either data set (Table 1; Supplemental Figs. 3-28). 
The lack of concordance for these modules between the two data 
sets can be explained by statistical fluctuations, differences in the 
number of samples, or undefined differences in sample composi- 
tion in the breast cancer data sets. Since human expression data 
sets are thought to be significantly underpowered to detect stable 
gene signatures associated with survival (Ein-Dor et al. 2006), we 
believe that at least some of these modules play a role in metastatic 
disease and, thus, were included in subsequent computational 
analyses. 

Since each of these statistically derived modules may repre- 
sent a co-regulated transcription program, we next performed gene 
set enrichment analysis (GSEA) (Subramanian et al. 2005) to de- 
termine if any module acted to maintain specific cellular functions 
and to identify common underlying functional pathways that 
might be associated with metastatic progression. Six of the 13 
modules were found to be most highly associated with the nucleus. 
mRNA transport, zinc ion binding, and chromatin modification 
were also represented by more than one module (Table 1). To- 
gether, these results suggest that nuclear biology plays a role in 
establishment of metastatic predisposition. This idea is supported 
by previous identification of the metastasis susceptibility genes 
Brd4, a transcriptional elongation factor and chromatin reader, 
Rrplb, a facultative heterochromatin protein, Ndn, a transcription 
factor, and Arid4b, a member of the Sin3a histone deacetylase 
complex (Crawford et al. 2007; Crawford et al. 2008a,b; Winter 
et al. 2012). 

Validation of the Cnot2 module co-regulation 

The network showing the highest reproducible statistical correla- 
tion to DMFS, Cnot2 (Fig. 2A), was selected for biological valida- 
tion. We hypothesized that the module could be a co-regulated 
expression unit potentially influenced by the expression of Cnot2, 
which might serve as a surrogate gene for the network. To test this 
idea, nine network hub genes (Atgl2, Riokl, Krccl, Angel2, Polrmt, 
TmemlOl, Exocl, Utp6, Ube2r2) (highlighted in gray in Fig. 2A) 
were selected to capture maximum network variance. The ex- 
pression of these nine hub genes, representing 52% (114 of 217 
genes) of the connectivity of the entire module, was assessed by 
quantitative real time PCR (qRT-PCR). Ectopic expression of Cnot2 
in 6DT1 murine mammary tumor cells significantly up-regulated 
five of nine hub genes (Atgl2, Riokl, Krccl, Angel2, Polrmt) (Fig. 2B), 
indicating that overexpression of Cnot2 within physiological 
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Figure 2. A transcriptional module centered on Cnot2 shows regulation by Cnot2 expression and predicts outcome in human breast cancer cohorts. (A) 
The Cnot2 module. Circles indicate individual genes. Connections between the genes were generated by minimizing the number of connections necessary 
to explain the gene expression correlations. (8) The effect of Oiof2overexpression in 6DT1 cells on network hub transcripts. (*) P<0.05, (**) P<0.01 . (C) A 
gene signature generated from homologous Cnot2 module hub transcripts predicts survival in human breast cancer cohorts. (D) Oncomine data set (Finak 
et al. 2008) shows CNOT2 is down-regulated in invasive breast carcinoma relative to normal breast tissue (normal breast tissue n = 6, invasive breast 
carcinoma n = 53; fold change = -16.3; P= -1.12 x 10~ 24 ). 



limits (—2-3 fold) resulted in co-upregulation of greater than half 
of the tested genes within the Cnot2 module and suggesting that at 
least a subset of genes within the Cnot2 network is part of a regu- 
latory module potentially controlled by Cnot2 expression. 

In vivo validation of Cnot2 in metastatic progression 

The Cnot2 network signature stratified breast cancer patients into 
DMFS high- and low-risk groups (Fig. 2C). To further probe the 



functional role of variations in Cnot2 expression in tumor progression 
a search of publicly available human breast cancer microarray data 
sets using Oncomine (Compendia Bioscience) was performed, re- 
vealing that CNOT2 expression was 16-fold lower in 53 invasive 
breast carcinoma samples compared with six normal breast tissue 
samples (Fig. 2D; Finak et al. 2008). Also, comparison of 67NR and 
4T1 cells, clones with distinct metastatic potential isolated from the 
same spontaneously formed murine mammary tumor (Aslakson and 
Miller 1992), showed that Cnot2 along with four module hub genes 
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were down-regulated in highly metastatic 4T1 cells relative to 
nonmetastatic 67NR cells (Supplemental Fig. 29). Taken together, 
these data show a consistent negative correlation between Cnot2 
expression and tumor progression in independent mouse breast tu- 
mor model systems and human patient data. 

To test if variation in Cnot2 expression has a direct, causative 
role in tumor progression, 6DT1 cells overexpressing Cnot2 were 
orthotopically implanted into syngeneic immune competent FVB/ 
NJ female mice. Assessment at 30 d post-implantation demon- 
strated a reduction in primary tumor mass and a pronounced 
suppression of pulmonary metastases (Fig. 3A). 6DT1 cells were 
also transduced to stably express short hairpin RNA constructs 
targeting Cnot2, achieving modest knockdown of Cnot2. Cnot2 
knockdown resulted in a significant increase in pulmonary me- 
tastasis but had no effect on primary tumor burden in our model 
(Fig. 3B,C). To increase confidence in the generalizability of our 
results, the orthotopic implantation assay was repeated on an 
independent murine mammary tumor cell line, Mvtl . Knockdown 
of Cnot2 in Mvtl cells enhanced both pulmonary metastasis 
and primary tumor mass (Supplemental Fig. 30), consistent with 
the activity of a progression gene in this cell line. These results 
validate the causative role of Cnot2 as a negative regulator of tumor 
progression. 

Identification of inherited metastasis susceptibility microRNAs 

By definition, heritable metastasis modifier miRNAs display allelic 
variations in expression that dictate differences in metastatic sus- 
ceptibility (Goldberger et al. 2013). We proposed that the highest 
confidence heritable miRNAs would contain ris-expression quan- 
titative trait loci (eQTL), as these are most likely to result from 
heritable variations that directly and locally influence transcript 
levels (Doss et al. 2005). In order to uncover such miRNAs we 
undertook high throughput sequence analysis of small RNAs de- 
rived from [AKXD n x PyMTJFj tumors (Fig. 1C) followed by eQTL 
analysis (Schadt et al. 2003) in the AKXD RI panel. 



Comparison of expression levels of significantly expressed 
miRNAs to the genetic map resulted in the detection of seven 
distinct eQTLs for 13 miRNAs exceeding the genome-wide statis- 
tical threshold for metastatic progression (Table 2). Interestingly, 
the human homologs of eight of these 13 miRNAs (61%; miR-671, 
miR-99a, miR-140, miR-582, miR-186, miR-216b, miR-23a, and 
miR-544) have previously been implicated in metastatic dissemi- 
nation (Dengetal. 2011; Haga and Phinney 2012;Jahidetal. 2012; 
Rutnam and Yang 2012; Kuo et al. 2013; Li et al. 2013a,d; Uchino 
et al. 2013; Yang et al. 2013a). Further, the highest confidence 
eQTL observed was for miR-200c*, a poorly characterized member 
of the Mir200c/141 cluster, which is thought to drive metastatic 
dissemination by inducing tumor cells to undergo epithelial-to- 
mesenchymal transition (Gregory et al. 2008). 

The sequence data were then queried for miRNAs whose ex- 
pression directly correlated with metastatic propensity across the 
AKXD RI panel. miR-216a, miR-216b, and miR-217 showed sig- 
nificant inverse correlations to metastasis (Fig. 4A), indicating that 
inherited variation in the expression of this set of miRNAs contrib- 
utes to differences in metastatic susceptibility across the AKXD panel. 

Comparison of human and mouse Mir216/217 sequences 
revealed a high degree of conservation of pre-miRNA and mature 
miRNAs, with 100% sequence identity for mature miR-216a and 
miR-217 and 96% identity for mature miR-216b (Supplemental 
Fig. 32). In addition, the target prediction tool TargetScan (Lewis 
et al. 2005) showed significant overlap in targets in mice and 
humans (Supplemental Fig. 33). At least one EST (DA732292 in 
humans) encompasses all three miRNAs, suggesting this miRNA 
cluster is expressed from the same primary transcript (Supple- 
mental Fig. 34), which raised the possibility that repression of this 
transcript may cause a synergistic enhancement of tumor cell 
metastatic potential. To test this idea, the Broad Institute miRNA 
target prediction tool was applied to predict and combine all miR- 
216/217 putative targets. A gene signature was then developed 
from predicted targets that showed significant variance in two 
independent mouse tumor data sets (Diversity Outcross and MOLF 
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Figure 3. Cnot2 expression suppresses tumor progression. (A) Overexpression of Cnot2 in 6DT1 cells showed 22% reduction in primary tumor mass (P = 
0.021) and 46% reduction in pulmonary metastases (P = 0.006, n = 20 per group). (B) Short hairpin-mediated knockdown Cnot2 by 51% (sh62) and 42% 
(sh64) in 6DT1 cells resulted in no significant change in primary tumor mass but enhanced metastatic potential respectively by 1 83% (P = 0.002) and 87% 
(P = 0.026, n 

control — 18 mice, n sn 62 — 10 mice, n sn 64 — 9 mice). (C) Representative lungs from knockdown experiment sectioned and stained with 
hematoxylin and eosin (H&E). (*) P < 0.05, (**) P < 0.01 . 
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Table 2. miRNA eQTLs 



miRNA eQTL 



miRNA Chromosome Position (Mb) Locus Chromosome Position (Mb) LOD score Permutation P-value 



miR-200c* 


6 


124.7 


rs6248036 


5 


21.5 


5.01 


0 


miR-671-3p 


5 


24.1 


rs41 88505 


16 


23.9 


5.25 


0.002 


miR-331-5p 


10 


93.4 


rs6248036 


5 


21.5 


5.6 


0.004 


miR-217 


11 


28.7 


rsl 3480929 


11 


14 


6.71 


0.006 


miR-99a 


16 


77.6 


rs3657839 


8 


60.5 


4.98 


0.011 


miR-216a 


11 


28.7 


rsl 3480929 


11 


14 


7.16 


0.013 


miR-140 


8 


110.1 


rs6352812 


9 


44 


5.45 


0.016 


miR-582-5p 


13 


110.1 


rs3657839 


8 


60.5 


4.33 


0.018 


miR-186 


3 


157.2 


rs3657839 


8 


60.5 


4.72 


0.02 


miR-216b 


11 


28.7 


rsl 3480929 


11 


14 


5.4 


0.025 


miR-23a 


8 


86.7 


rsl 3476241 


1 


85.8 


4.34 


0.032 


miR-376c 


12 


111 


rs630061 3 


8 


62.4 


4.53 


0.041 


miR-544 


12 


111 


rs3660779 


2 


29.5 


14 


0.05 



[Hu et al. 2013], Supplemental Table 15). This miR-216/217 target 
gene signature demonstrated statistically significant stratification 
of metastatic risk in murine mammary tumors as well as in human 
breast cancer (Supplemental Figs. 35, 36). Interestingly, the human 
data demonstrated statistically significant stratification of meta- 
static risk specifically in estrogen receptor (ER) negative tumors. 

To test the direct causative role of miR-216a, miR-216b, and 
miR-217 in breast cancer, 6DT1 cells were transduced to stably 
express pre-miRNA sequences (Fig. 4C). As shown in Figure 4D, 
orthotopic implantation of 6DT1 cells ectopically expressing each 
miRNA resulted in minimal effect on primary tumor mass although 
expression of Mir216b and Mir217 trended toward reduction of 
primary tumor growth. Despite insignificant effect on tumor 
growth, Mir216b and Mir217 expression in 6DT1 cells resulted in 
dramatic suppression of pulmonary metastasis (Fig. 4D). 

The Mir216/217 cluster predicted target signature stratified 
metastatic risk in basal-like murine tumors (Hu et al. 2013) as well 
as ER-negative human tumors. We thus also tested the Mir216a, 
Mir216b, and Mir21 7 expression in the 4T1 cell line, an ER-negative 
murine tumor model (Kaur et al. 2012). Orthotopic implantation 
of 4T1 cells stably expressing Mir216a, Mir216b, or Mir21 7 showed 
statistically significant tumor suppression with a dramatic re- 
duction in tumor mass for 4T1 cells expressing Mir216a and a more 
moderate tumor suppressive effect for cells expressing Mir216b 
(Supplemental Fig. 37). In 4T1 cells, all three miRNAs showed 
statistically significant inhibitory effects on metastasis. Despite 
observed differential effects on primary tumor growth between 
6DT1 and 4T1 cells, expression of the Mir216/217 cluster showed 
similar suppressive effects on pulmonary metastasis. This di- 
rectional concordance between genetic associations derived 
from sequence analysis in AKXD mice with in vivo metastasis 
assays in two independent mammary tumor cell lines indicates 
that the Mir2 16/217 cluster is a heritable suppressor of tumor 
progression. 

Integrated analysis of gene expression profiling 
and miRNA expression data 

To gain insight into post-transcriptional regulation of the co- 
expressed modules associated with metastasis susceptibility, we 
undertook an approach that integrated our mRNA and small RNA 
profiling data. We hypothesized that querying transcripts for the 
presence of common microRNA recognition elements (MREs) 
might yield metastasis master regulator miRNAs (Fig. ID). 



We searched within all networks generated from expression 
profiling of [AKXD x PyMTJFi tumors, focusing on genes that 
capture the majority of the network variability by restricting 
analysis to hub transcripts. A gene signature was constructed 
from the 153 hub transcripts (Supplemental Table 16) and 
screened for the ability to discriminate patient outcome. This 
gene signature was found to discriminate patient outcome in 
both GSE2034 and GSE11121 data sets (Supplemental Fig. 38), 
suggesting that these 153 hub transcripts were significantly 
associated with DMFS. 

Both mRNA and miRNA genome-wide screens were per- 
formed on RNA from the same tumors. The data captured by 
mRNA expression array and miRNA sequencing could thus be in- 
tegrated for genome-wide query of potential miRNA-mRNA target 
pairs by identifying negatively correlated miRNA-gene pairs. 
Correlations with a P-value s 0.01 after 10,000 permutations were 
considered significant, producing a total of 16,475 miRNA-gene 
pairs consisting of 369 miRNAs in negative correlation with 4153 
genes (Supplemental Table 17). The resulting data were then sub- 
jected to three filters. First, to increase the probability of identify- 
ing metastasis-driver microRNAs, all subsequent analysis was re- 
stricted to the 153 metastasis-associated hub transcripts. Second, 
the results were filtered for genes with shared computationally 
predicted MREs. Third, those miRNAs targeting the filtered set of 
mRNAs more frequently than predicted by chance were selected as 
candidate metastasis-driving miRNAs. 

The first two filters yielded 38 miRNAs putatively targeting 49 
of the 153 highly connected hub genes (Supplemental Table 18). 
Of these 38 candidate miRNAs 25 have known human homologs, 
of which 1 1 have previously been identified to directly act in the 
metastatic cascade in various tumor types (let-7f, miR-30c, miR-31, 
miR-127, miR-134, miR-140, miR-148a, miR-196b, miR-210, miR- 
503, miR-582) (Valastyan et al. 2009; Li et al. 2010; Liang et al. 
201 1; Ying et al. 201 1; Zhou and Wang 201 1; Bockhorn et al. 2013; 
Li et al. 2013c; Liu et al. 2013; Uchino et al. 2013; Yang et al. 
2013a,b) while several other miRNAs have been associated with 
distant metastatic disease (miR-152, miR-298, miR-326, miR-423) 
(Farazi et al. 2011; Bao et al. 2012; Hui et al. 2013; Valencia et al. 
2013). Moreover, as shown above, miR-140 and miR-582 were 
also found to contain a trans-eQTL (Table 2) implicating them as 
candidate heritable metastasis susceptibility miRNAs, which is 
consistent with the notion that these miRNAs target heritable 
metastasis susceptibility transcriptional networks. In total, 15 
(60%) of 25 human homologs of murine miRNAs identified by 
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Figure 4. Mir216/217 are inherited metastasis suppressors. (A) Expression-phenotype correlations for Mir216a, Mir216b, and Mir217 in [AKXD n x 
PyMTJF! tumors. Blue, yellow, and red points indicate AKXD sublines with respective weak, intermediate, and high metastatic propensities. (8) miR-eQTL 
analysis demonstrated peaks reaching genome-wide significance (red line)for M/r27 6a and Mir21 7 and reaching the "suggestive" threshold (gray line) for 
Mir216b. Consistent with a c/s-eQTL, these peaks co-localize to the Mir216/21 7 locus on chromosome 1 1 . Please refer to Supplemental Figure 31 for 
a detailed, high-resolution format of eQTL plots. (C) Ectopic expression of Mir216a, Mir21 6b, and Mir21 7 in 6DT1 cells. (D) Orthotopic implantation of 
6DT1 cells expressing control miRNA, Mir216a, Mir216b, or Mir21 7. M/r21 6b and Mir2 1 7 expression reduced pulmonary metastasis by 79% and 80%, 
respectively (p M Mjeb = 0.01 1 , 

PMir2i7 — 0.022; n = 1 0 mice per group). (*) P < 0.05. 
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integrating transcriptional network analysis with small RNA se- 
quencing data have previously been implicated in metastasis. 

Unexpectedly, the third filter revealed that 32 (65%) of the 
49 target genes were predicted targets of miR-3470a or miR- 
3470b, suggesting that this miRNA family plays an important 
role in the establishment of metastatic susceptibility. Fourteen 
genes were predicted targets for both miR-3470a and miR-3470b 
and each miRNA targeted two hubs in the Cnot2 network 
(Supplemental Fig. 39). miR-3470a and miR-3470b were there- 
fore selected for validation as candidate metastasis-driver 
microRNAs. 

In vivo validation of miR-3470a and miR-3470b 
as mouse-specific promoters of tumor progression 

We initially turned to 67NR and 4T1 cells, two cell lines with dis- 
tinct differences in metastatic potential, to verify the biological 
role of Mir3470a and Mir3470b in metastasis. Measurement of 
Mir3470a and Mir3470b expression by miR-Taqman qRT-PCR 
revealed that these miRNAs were more highly expressed in highly 



metastatic 4T1 cells than nonmetastatic 67NR cells (Supplemental 
Fig. 40), indicating that these miRNAs may be pro-metastatic. 

As all previous in vivo analyses and Cnot2 network validation 
were conducted in 6DT1 cells, we next turned this cell line to test 
direct, causative effects of Mir3470a/b on network structure, gene 
expression, and tumor cell metastatic potential. We generated 
6DT1 cells stably expressing Mir3470a and Mir3470b and con- 
firmed Mir3470a/b expression by qRT-PCR, demonstrating seven- 
fold overexpression of Mir3470a and 88-fold overexpression of 
Mir3470b (Fig. 5A). Assessment of hub transcript expression levels 
upon Mir3470a overexpression demonstrated that three (Chac2, 
Dnpep, ArfrpI) of the nine tested hub transcripts predicted to be 
miR-3470a/b targets were down-regulated upon miR-3470a over- 
expression while seven (Ap2a2, Insig2, Chac2, Ddx46, Dnpep, Cnot8, 
ArfrpI) of the nine tested hub transcripts were down-regulated upon 
miR-3470b overexpression (Supplemental Fig. 41). Since miR- 
3470a and miR-3470b possess the same seed sequence — the 5'-end 
miRNA sequence thought to contribute most strongly to miRNA 
recognition of target transcripts — it is possible that the 10-fold 
difference in expression between Mir3470a- and Mir3470b- 
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Figure 5. Mir3470a and Mir3470b promote metastatic progression. (A) Stable transduction of Mir3470a and Mir3470b in 6DT1 cells. (6) Orthotopic 
implantation of 6DT1 cells overexpressing Mir3470a showed 40% increase in tumor mass (P = 0.014) and 230% increase in pulmonary metastasis (P = 
0.0069). Overexpression of Mir3470b resulted in 21 3% increase in pulmonary metastasis (P = 0.088; n contr0 | = 9 mice, n Mir3470a = 10 mice, n Mir _3 470b = 1 0 
mice). (C) Representative H&E stained lung sections from mice implanted with control or cells overexpressing Mir3470a/b. (*) P < 0.05, (**) P < 0.01 . 
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A. 



expressing cells explains the difference in predicted target down- 
regulation. 

In addition, Mir3470a and Mir3470b overexpression down- 
regulated multiple network hub genes within the Cnot2 module 
(Cnot2, TmemlOl, Utp6, Exocl, Polrmt, Ube2r2, and Angel2), only 
some of which are predicted to be miR-3470a/b targets (Supple- 
mental Fig. 42). Intriguingly, Mir3470b overexpression appeared to 
have a unidirectional effect on previously identified metastasis 
susceptibility genes, up-regulating the pro-metastatic Brd4 short 
isoform (Brd4-SF) while down-regulating anti-metastatic genes: 
Brd4 long isoform (Brd4-LF), Rrplb, and 
Cadml (Supplemental Fig. 43; Crawford 
et al. 2007, 2008a; Alsarraj et al. 2011; 
Faraji et al. 2012). Consistent with the 
correlative data in 67NR and 4T1 cells, the 
negative regulatory effect of Mir3470a/b 
on Cnot2 module transcripts and the 
changes in expression of previously iden- 
tified metastasis susceptibility genes were 
suggestive of a pro-metastatic role for 
Mir3470a/b. 

We next implanted Mir3470a or 
Mir3470b overexpressing 6DT1 cells into 
syngeneic mice to test the direct effect of 
miR-3470a/b on tumor progression and 
metastasis. Orthotopic implantation of 
Mir3470a overexpressing 6DT1 cells into 
mammary fat pads of immune competent 
FVB/NJ mice resulted in mild but statis- 
tically significant enhancement in tumor 
mass and a dramatic increase in pulmo- 
nary metastases while Mir3470b over- 
expressing 6DT1 cells showed specific 
promotion of metastatic colonization 
(Fig. 5B,C). Consistent with gene expres- 
sion correlation studies in 67NR and 4T1 
cells, as well as expression impact on the 
Cnot2 module, these in vivo results con- 
firm Mir3470a as a promoter of tumor 
progression, and identify Mir3470b as 
a metasta-miR (Hurst et al. 2009). 



for unbiased, genome-wide isolation of all biologically relevant 
co-expressed transcriptional modules. Each module was then si- 
multaneously tested for association with metastasis and human 
clinical significance by deriving module gene signatures and ap- 
plying them to independent expression-level patient data sets of 
node-negative nonadjuvant treated breast cancer with associated 
clinical endpoints. Our strategy, based on mRNA and miRNA 
expression in primary tumors, identified both metastasis-driving 
genes and miRNAs, further supporting a paradigm in which 
transcription programs within the primary tumor are deter- 




B. 



Discussion 

The systems genetics analysis performed 
here provides a view of the transcriptional 
networks and corresponding post-tran- 
scriptional regulatory mechanisms driv- 
ing metastatic breast cancer. Rather than 
just attempting to identify single genes 
acting in metastasis, we undertook a 
higher-order approach to gain insight 
into the transcriptional profiles poten- 
tially driving metastatic susceptibility by 
emphasizing connectivity and transcript- 
level interactions (Fig. 6A). Networks of 
co-expressed genes derived from global 
expression profiling of [AKXD n x 
PyMTJFj primary tumors were identified 
agnostic to the metastatic phenotypes 
observed in the parental strains or RI 
panel sublines. This approach allowed 
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Figure 6. Integration of systems genetics analysis. (A) Circos integration of global transcript and 
miRNA profiling analyses. From periphery inward: clockwise proximal to distal chromosomal posi- 
tioning (blue), (1) mRNAs showing statistically significant eQTL peaks (red peaks), (2) mRNAs with 
expression significantly associated with metastatic propensity (blue peaks), (3) hub transcripts 
encompassed within eQTL intervals (black peaks), (4) miRNAs encompassed within miR-eQTL intervals 
(green peaks), (5) miRNAs with metastasis-associated expression (pink peaks), (7) miRNAs predicted to 
target highly connected network hubs (blue line links miRNA and its predicted target mRNA, red point 
indicates mRNA position). Genes labeled in purple indicate module central nodes. Genes labeled in 
green indicate hub transcripts with predicted miR-3470b MREs that are down-regulated upon Mir3470b 
overexpression. miRNAs shown to have a causal role in metastasis are indicated in blue. Peak lengths are 
in -loglO P-value. Please refer to Supplemental Figure 44 for a detailed high-resolution version. (6) 
Proposed model for murine metastatic progression. Global genome hypomethylation associated with 
neoplastic transformation results in the DNA demethylation and transcriptional activation of short in- 
terspersed elements (SINEs) including Mir3470b. Mir3470b expression down-regulates metastasis 
suppressive genes and the metastatic progression inhibitory Cnot2 network while concurrently up- 
regulating the metastasis promoting Brd4-SF, driving metastatic progression. 
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minants of metastatic progression (van de Vijver et al. 2002; 
Ramaswamy et al. 2003). In addition to describing a novel ap- 
proach to the identification of disease-specific heritable suscep- 
tibility transcription modules, the networks themselves provide 
a resource for selecting and prioritizing genes and miRNAs for 
analysis, based on position, connectivity, and number of poten- 
tial target genes. 

The Ndn module, for example, incorporated four previously 
identified metastasis susceptibility genes (Ndn, Pil6, Arap3, Csflr). 
The Cnot2 module also contained at least four previously identified 
regulators of tumor progression: the metastasis susceptibility gene, 
Arid4b, its protein interaction partner, Sin3a, Luc7l, and the anti- 
proliferative cell cycle component Btg2 (Crawford et al. 2008b; 
Takahashi et al. 2011; Winter et al. 2012; Das et al. 2013). Our 
laboratory validated the direct role of both Arid4b and Luc7l using 
genetic and in vivo modeling of metastasis (Crawford et al. 2008b; 
Winter et al. 2012). Since expression modules were identified by 
unsupervised clustering of array data, the incidental membership 
of bona fide metastasis modifier genes within these networks 
provide another measure of confidence in our method for identi- 
fying potential transcription networks that may drive metastatic 
susceptibility. Based on the reproducibility of the prognostic power 
of the Cnot2 module we chose to investigate Cnot2 as both a po- 
tential metastasis modifier gene and a factor contributing to the 
establishment of the transcriptional network module. 

Cnot2 is a core component of the CCR4-NOT complex, 
a multifunctional protein complex that is highly conserved in 
Eukarya. It has been reported to have transcriptional regulatory, 
deadenylase, ubiquitin ligase, and chromatin modifying activities 
(Collart and Panasenko 2012). It is additionally thought to play 
an essential role in miRNA-mediated transcript repression by desta- 
bilizing miRNA-bound target transcripts (Behm-Ansmant et al. 
2006). CCR4-NOT complex subunit 2, Cnot2, was originally iden- 
tified as a negative regulator of transcription in yeast (Collart 
and Struhl 1994). Although Cnot2 has not been reported to have 
any enzymatic activity, it has been shown to be necessary for the 
CCR4-NOT deadenylase activity in mammalian cells (Ito et al. 201 1). 
Moreover, studies indicate the contribution of Cnot2 to cellular 
viability (Ito et al. 2011) as well as differentiation state (Zheng et al. 
2012), both thought to play critical cellular functions in the meta- 
static cascade (Luzzi et al. 1998; Kouros-Mehr et al. 2008). In yeast 
Cnot2 was observed to repress specific gene transcription in a his- 
tone deacetylase-dependent manner (Jayne et al. 2006) as well as to 
promote resumption of transcriptional elongation likely by bind- 
ing directly to RNA polymerase II (Kruk et al. 2011). Intriguingly, 
Cnot2 has also been reported to physically interact with the me- 
tastasis-associated proteins BRD4 (which incidentally is a member 
of the Polr2g module) and TPX2 (Crawford et al. 2008a; Lau et al. 
2009; Hu et al. 2012), raising the possibility that these proteins act 
in complex to modify tumor cell metastatic potential. All of these 
activities and associations are under current investigation as po- 
tential mechanisms that contribute to the regulation of the Cnot2 
module and C«of2-mediated metastatic suppression as observed in 
the present study. Nonetheless, the identification of Cnot2 as an in- 
hibitor of tumor progression represents a significant finding as it links 
an entirely novel protein complex as a determinant of metastasis. 

Further, we show by two distinct methods that miRNAs have 
functional regulatory roles in networks of heritable metastasis 
susceptibility: (1) Metastasis-suppressing miRNAs possess cis- 
eQTLs, indicative of heritable miRNA expression levels, and (2) 
miRNAs can target heritably derived metastasis-driving transcrip- 
tional networks. Two independent analyses identified several 



novel heritable metastatic susceptibility miRNA genes (Mir216a, 
Mir216b, Mir217, Mir3470a, Mir3470b). The discrepant effects on 
tumor progression of the Mir216/217 cluster between 6DT1 and 
4T1 cells is under investigation but is likely attributable to differ- 
ences in expressed genes between these cell lines. 6DT1 cells were 
derived from a MMTV-MTC transgenic tumor (Pei et al. 2004) with 
a gene expression profile resembling luminal-like human breast 
tumors (Lim et al. 2010) while 4T1 cells are derived from a spon- 
taneous tumor in BALB/c mice with an expression profile re- 
sembling basal-like human breast tumors (Kaur et al. 2012). As 
such, 6DT1 and 4T1 cells contain major transcriptomic differences 
0E Green, unpubl.) and therefore likely express different subsets of 
miR-216/217 target transcripts, which could account for differ- 
ences observed in tumor progression phenotypes. Our results are 
consistent with the notion that these miRNAs regulate numerous 
genes that promote tumor progression and which may be 
expressed differentially in different metastatic tumors. Notably, 
miR-216b and miR-217 have previously been implicated as in- 
hibitors of tumor growth and invasion in nasopharyngeal, renal 
clear cell, and pancreatic ductal carcinoma (Deng et al. 2011; Ali 
et al. 2012; Li et al. 2013b). Further, each member of the Mir216/ 
217 cluster was demonstrated to possess a significant ris-acting 
eQTL, indicating that differences in Mir2 16/217 expression across 
the AKXD RI panel are inherited and not acquired somatically 
during neoplastic transformation or tumor progression. 

In contrast miR-3470a/b were selected to validate our com- 
bined small RNA sequencing and mRNA profiling approach. We 
surmised that the relative overrepresentation of miR-3470a/b tar- 
gets in our transcription networks would render these miRNAs the 
most powerful in testing the validity of our miRNA-mRNA in- 
tegrated network approach. Although miR-3470a/b were identified 
by target prediction analysis by combining all metastasis-associated 
network transcripts, their role in metastasis may be in part medi- 
ated by direct or indirect targeting of previously identified metas- 
tasis susceptibility genes (Brd4, Rrplb, Cadml), genes within the 
Cnot2 network, genes within all networks constructed here, or a 
combination of the above. Despite no conserved human homolog 
to these murine miRNAs, their effect on transcriptional networks 
conserved among mice and humans validates our approach of 
integrating network analysis and miR-seq data and provides another 
line of evidence for the potential functional role of our statistical 
correlation-based expression networks in tumor progression. 

A closer look at the genomic context of Mir3470a/b reveals 
that both are degenerated Bl-type short interspersed nuclear ele- 
ments (SINEs), which are thought to be predominantly silenced in 
adult cell types largely by DNA methylation (Su et al. 2012). This is 
consistent with the initial report identifying miR-3470a and miR- 
3470b showing that its expression is restricted to the gonads (Ahn 
et al. 2010). There is also accumulating evidence supporting ob- 
servations of global DNA hypomethylation in cancer (Feinberg 
and Vogelstein 1983), with the most comprehensive genome- 
wide, nucleotide-level data available for breast cancer (Hon et al. 
2012), and that the resulting hypomethylation leads to expression 
of repetitive elements (Florl et al. 1999). Given these observations, 
we propose that murine tumorigenesis leads to hypomethyla- 
tion and expression of repetitive elements including Mir3470a/b, 
which drive tumor progression and metastasis (Fig. 6B). It remains 
unclear whether hypomethylation is a somatic or heritable event, 
but the strength of the effect by which Mir3470a/b can drive tumor 
progression may depend on heritable expression levels of metas- 
tasis susceptibility networks described in the current study and/or 
previously described susceptibility genes. 
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The networks described here represent statistically associated 
sets of expressed genes and therefore are likely partial representa- 
tions of the true transcriptional structure underlying metastatic 
disease. Like the prognostic gene signatures, the exact structure 
and membership of the metastasis-associated transcriptional net- 
work will likely change somewhat as additional samples and 
analyses are performed (Ein-Dor et al. 2006). Encouragingly, sig- 
nificant overlap of seven of the AKXD modules has been observed 
in TOM analysis of an independent mapping cross (P-values 
0.001-6 x 10~ 12 ) (K Hunter, Y Hu, J Zhang, unpubl.), indicating 
that at least some portion of the network will be stable. These re- 
sults also suggest that a similar analysis performed on a much 
larger sample set, derived for example from the use of the new 
Collaborative Cross mouse genetic mapping resource (Churchill 
et al. 2004), will improve this higher-order analysis of the suscep- 
tibility mechanisms associated with metastatic breast cancer and 
identify novel and more robust targets for clinical intervention. 
Although significant work remains to understand the details un- 
covered in this study, the observations presented here not only 
support our strategy of identifying higher-order transcriptional 
structure but also highlight that such approaches are essential for 
acquiring a broad-scale understanding of tumor progression. 

Methods 

Gene expression and small RNA sequencing 

Affymetrix MOE430 gene expression analysis was performed as 
previously described (Yang et al. 2005). Total RNA was isolated 
from two to four tumors from each of the [AKXD x PyMT]F! 
outcrosses; due to availability constraints of AKXD RI panel ani- 
mals, only 18 of the 24 sublines were crossed to MMTV-PyMT 
mice. Where possible the same tumors used for the MOE430 ex- 
pression analysis were selected for small RNA sequencing. Total 
RNA was extracted from the mammary tissues using the mirVana 
miRNA Isolation Kit (Ambion). Each RNA sample was applied to 
the Nanodrop 1000 Spectrophotometer (Thermo Scientific) to 
calculate sample concentration and purity. All samples displayed 
a 260/280 ratio >2.0 with a small portion having a 260/230 ratio 
<2.0. Samples were further analyzed for integrity using the Agilent 
2100 Bioanalyzer. Equal amounts of total RNA were pooled for 
each of the genotypes within the AKXD RI x PyMT panel then size 
selected for small RNA species before sequencing. The NCI Next 
Generation Sequencing Core performed enrichment of small RNA 
and sequencing on Illumina GA IIx. 

Network analysis 

We used an R package "weighted correlation network analysis 
(WGCNA)" (Langfelder and Horvath 2008) to find clusters (mod- 
ules) of highly correlated genes on AKXD mRNA expression array 
data sets: GSE30864, GSE30865, GSE30866, GSE31223. The net- 
work for each module was generated with the minimum spanning 
tree with a dissimilarity matrix from WGCNA. Hub transcripts 
were determined by links (degree of freedom) >5. This study uti- 
lized the high-performance computational capabilities of the Bio- 
wulf Linux cluster at the National Institutes of Health, Bethesda, 
MD (http://biowulf.nih.gov). 

eQTL analysis 

eQTL analysis was performed with R package qtl. Expression levels 
of all significantly expressed miRNAs were compared with the 



genetic map using an R/QTL package (Broman et al. 2003) and 
genome-wide significance levels determined by a bootstrap per- 
mutation test (1000 permutations) using both the permutation tail 
probability test and the one-sided P-value test. The permutation 
tests were based on resampling with samples drawn without re- 
placement from the pooled sample set. The observed value T(obs) 
of the test statistic was calculated and the observations/samples 
were pooled. Next the set of the statistic values were calculated by 
resampling each sample. For example, for 10,000 permutations, 
9999 statistic values were calculated. Finally, the one-sided P-value 
of the test was then calculated as the proportion of sampled per- 
mutations where the difference in means was greater than or equal 
to T(obs). The genome-wide statistical threshold was set at 0.05. 

MiRNA-phenotype correlation analysis was conducted using 
the R package glm. 

Correlation of microRNA abundance with gene expression 

Correlation analysis between the expression of microRNA and 
targeted genes was performed for all pairs of genes and micoRNAs, 
then filtered by the genes predicted as the putative targets of 
a particular microRNA. Targets were predicted from three of four 
databases: PITA (Witkos et al. 201 1), miRBase (Griffiths-Jones et al. 
2006), TargetScan, and miRanda (Miranda et al. 2006). The asso- 
ciation of the expression between miRNAs and their target genes 
was analyzed with R packge^/m. The significant associations were 
determined by a bootstrap permutation test (10,000 times and 
P-value < 0.01). For genes whose expression significantly correlated 
with microRNA expression, a gene set enrichment analysis (GSEA) 
was performed by a hypergeometric test in R. 

Animal studies 

Female FVB/NJ or Balb/cJ mice from Jackson Laboratories were 
injected at 6-8 wk of age. Two days prior to orthotopic injections, 
cells were placed in nonselective media. On the day of injection, 
1 x 10 s cells were injected orthotopically into the fourth mam- 
mary fat pad of age-matched, syngeneic, immune competent vir- 
gin females. After 30 d the mice were euthanized by intraperitoneal 
injection of 1 mL Tribromoethanol with subsequent cervical dis- 
location. Primary tumors were resected, weighed, and snap-frozen 
in liquid nitrogen. Lungs were resected, surface metastases were 
counted; lungs were inflated with 10% nitrate-buffered formalin 
and sent for sectioning and staining. All procedures were per- 
formed under the Animal Safety Proposal (LCBG-004) and approved 
by the NCI-Bethesda Animal Care and Use Committee. 

Data access 

The AKXD small RNA sequencing data from this study have been 
submitted to the NCBI Gene Expression Omnibus (GEO; http:// 
www.ncbi.nlm.nih.gov/geo/) under accession number GSE50179. 
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